Correcting Show-Through Effects on Document Images by Multiscale Analysis
Abstract
This paper describes a new approach to restoring color document images in which the backside image shows through the paper sheet. A new framework is presented for correcting show-through components using digital image processing techniques. First, the foreground components on the front side are separated from the background and backside components through locally adaptive binarization of each color component and edge magnitude thresholding. Background colors are then estimated locally through color thresholding to generate a restored image, which is corrected adaptively through multi-scale analysis and a comparison of edge distributions between the original and restored images. The proposed method corrects unwanted image components by analyzing the front-side image alone. Experimental results are given to verify the effectiveness of the proposed method.
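The first step, foreground separation, can be illustrated with a minimal sketch. The abstract does not give the binarization rule, window sizes, or thresholds, so everything here is an assumption: a local-mean threshold stands in for the locally adaptive binarization, a central-difference gradient stands in for the edge magnitude, and the names `adaptive_binarize`, `edge_magnitude`, `foreground_mask` and all parameter values are hypothetical. The idea it shows is that show-through tends to be darker than its surroundings but has weaker edges than front-side ink, so dark pixels are kept only when a strong edge is nearby.

```python
from typing import List

Channel = List[List[float]]  # one color component, as rows of intensities


def local_mean(img: Channel, y: int, x: int, radius: int) -> float:
    """Mean of the window centred on (y, x), clipped at the image borders."""
    h, w = len(img), len(img[0])
    window = [img[j][i]
              for j in range(max(0, y - radius), min(h, y + radius + 1))
              for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return sum(window) / len(window)


def adaptive_binarize(img: Channel, radius: int = 2,
                      offset: float = 10.0) -> List[List[bool]]:
    """True where a pixel is at least `offset` darker than its local mean.

    Assumption: a local-mean rule; the source does not name the rule it uses.
    """
    return [[img[y][x] < local_mean(img, y, x, radius) - offset
             for x in range(len(img[0]))]
            for y in range(len(img))]


def edge_magnitude(img: Channel) -> Channel:
    """Central-difference gradient magnitude with replicated borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def foreground_mask(channels: List[Channel], radius: int = 2,
                    offset: float = 10.0,
                    edge_thresh: float = 60.0) -> List[List[bool]]:
    """Keep pixels that are dark in any channel and adjacent to a strong edge.

    Assumption: this AND-combination is illustrative, not the paper's rule.
    """
    h, w = len(channels[0]), len(channels[0][0])
    dark = [adaptive_binarize(c, radius, offset) for c in channels]
    edges = [edge_magnitude(c) for c in channels]
    # A pixel is a strong edge if any channel exceeds the edge threshold.
    strong = [[any(e[y][x] > edge_thresh for e in edges) for x in range(w)]
              for y in range(h)]
    mask = []
    for y in range(h):
        row = []
        for x in range(w):
            is_dark = any(d[y][x] for d in dark)
            # Look for a strong edge in the 3x3 neighbourhood of (y, x).
            near_edge = any(strong[j][i]
                            for j in range(max(0, y - 1), min(h, y + 2))
                            for i in range(max(0, x - 1), min(w, x + 2)))
            row.append(is_dark and near_edge)
        mask.append(row)
    return mask
```

Calling `foreground_mask([r, g, b])` with three equally sized channels returns a boolean grid. With a 3x3 edge neighbourhood the sketch only suits thin strokes; interior pixels of thick strokes would need a connected-component or region-growing step that is not shown here.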
Similar Resources
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution stages, for Persian page segmentation. In the low-resolution stage, a pyramidal image structure is constructed for multiscale analysis, and the document image is segmented into a set of regions. In the high-resolution stage, each region is segmented into homogeneous regions by connected components analysis and identifyi...
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...
Geometric Distortion Correction of Texts Using Geometric Information of Text Lines
Document images produced by scanners or digital cameras usually have photometric and geometric distortions. If either of these effects distorts the document, OCR word recognition on the resulting image is prone to errors. In this paper we propose a novel approach that substantially reduces geometric distortion in document images. In this method, we first extract document lines from do...
Parameter-Free Geometric Document Layout Analysis
Automatic transformation of paper documents into electronic documents requires geometric document layout analysis at the first stage. However, variations in character font sizes, text line spacing, and document layout structures have made it difficult to design a general-purpose document layout analysis algorithm for many years. The use of some parameters has therefore been unavoidable in prev...